Week 15: Masters of Your Own Destiny

Dr. T. Kody Frey

Assistant Professor | School of Information Science

Housekeeping

Overview

  • Housekeeping
  • Getting Data from Qualtrics
  • Spring Cleaning!
  • Data Reduction: EFAs
  • Discussion

Project Timeline

  • Only 2 weeks left! Everyone is approved and all surveys have launched.
  • Groups should be working on the front end (literature review, hypotheses, methods)
  • Final Report must include:
    • APA format
    • Introduction, Rationale, Lit Review, RQs/Hs, Methods, Results, Discussion (Implications, Limitations, Future Research), Conclusion, References, Any Relevant Appendices
  • Final Conference-Style Presentation must include:
    • Visual aid
    • 15-minute max
    • Each member should speak
    • Brief Q&A at the end

Help Sessions This Week

What’s Next?

Where we’re at

Your Task?

  • Download and clean your data
  • Run your EFA
  • Run your tests (time permitting)

Downloading your data from Qualtrics

Data & Analysis

Export & Import

Export Data

Download SPSS File

Experimental Conditions

Running Your EFA

What is an EFA?

Exploratory factor analysis (EFA) identifies clusters of survey items that measure the same underlying construct by modeling the items' shared variance as a smaller number of factors.

Step 1: Get Packages

Install the necessary packages:

install.packages('foreign')
install.packages('psych')
install.packages('corrplot')
install.packages('ggplot2')

Step 2: Load ’Em Up

library(foreign)
library(psych)
library(corrplot)
library(ggplot2)

Step 3: Pull In Your Data Using Foreign

DATA <- read.spss("/Users/kodyfrey/Desktop/CI 665/Week 15/SampleData_SpedUp_665_Prepped.sav", use.value.labels=FALSE, to.data.frame=TRUE)

describe(DATA)

CogLearning=na.omit(subset(DATA, select= c(CLearn_1, CLearn_2, CLearn_3, CLearn_4, CLearn_5_R, CLearn_6, CLearn_7_R, CLearn_8_R, CLearn_9, CLearn_10_R)))
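
To see what subset() plus na.omit() does, here is a tiny base-R illustration with toy data (the column names and values are made up, not from the class dataset):

```r
# Keep only the two columns of interest, then drop any row containing an NA
toy <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6), junk = 7:9)
clean <- na.omit(subset(toy, select = c(a, b)))
nrow(clean)   # 2: the row with the missing value on 'a' is dropped
```

Listwise deletion like this shrinks your sample, so check how many rows survive before moving on.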

Step 4: Correlation Matrix

round(cor(CogLearning, use="complete.obs"),2)
cor_matrix_cog <- round(cor(CogLearning, use="complete.obs"),2)
Code
## Plots correlations in order
corrplot(cor(CogLearning, use="complete.obs"), order = "original", tl.col='black', tl.cex=.75) 

Code
## Groups correlations by clusters
corrplot(cor(CogLearning, use="complete.obs"), order = "hclust", tl.col='black', tl.cex=.75) 

Step 5: Factorability

## Want values above .60 and significant

cortest.bartlett(CogLearning, n = nrow(CogLearning),diag=TRUE)

$chisq
[1] 970.8732

$p.value
[1] 3.732124e-174

$df
[1] 45

KMO(cor_matrix_cog)

Kaiser-Meyer-Olkin factor adequacy
Call: KMO(r = cor_matrix_cog)
Overall MSA =  0.87
MSA for each item =
   CLearn_1    CLearn_2    CLearn_3    CLearn_4  CLearn_5_R    CLearn_6 
       0.90        0.88        0.89        0.88        0.91        0.86 
 CLearn_7_R  CLearn_8_R    CLearn_9 CLearn_10_R 
       0.83        0.82        0.86        0.86 

A scree plot helps determine the number of factors to extract

Locate the ‘elbow’ of the plot; where does it begin to flatten?

Code
scree(CogLearning)
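
The intuition behind the scree plot can be sketched in base R with simulated data. Everything here (the seed, sample size, and single latent factor) is an illustrative assumption, not the class dataset:

```r
# Simulate 5 items that all load on one latent factor, then inspect the
# eigenvalues of their correlation matrix -- the values a scree plot displays
set.seed(665)
f <- rnorm(200)                                    # latent factor scores
items <- sapply(1:5, function(i) f + rnorm(200))   # 5 noisy indicators
eig <- eigen(cor(items))$values
round(eig, 2)   # one large eigenvalue, then a flat tail: the 'elbow'
```

With one underlying factor, the first eigenvalue dominates and the remaining ones flatten out, which is exactly the pattern you look for in scree(CogLearning).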

Parallel analysis (PA) tests your data against simulated data to determine the most likely number of factors

Code
parallel=fa.parallel(CogLearning, fm="pa", fa="fa")

Parallel analysis suggests that the number of factors =  2  and the number of components =  NA 
Code
sum(parallel$fa.values > 1.0)
[1] 2
Code
sum(parallel$fa.values > .7)
[1] 2

Step 6: Compare!

Scree and PA will recommend a number of factors. Place this value in ‘nfactors=X’.

Run the model with ‘nfactors=X-1’ and ‘nfactors=X+1’ for comparison.

Code
CL_1 = fa(CogLearning, nfactors=1, rotate = "oblimin", fm="mle")
print(CL_1, digits = 4)
Factor Analysis using method =  ml
Call: fa(r = CogLearning, nfactors = 1, rotate = "oblimin", fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
               ML1     h2     u2 com
CLearn_1    0.7954 0.6326 0.3674   1
CLearn_2    0.6013 0.3616 0.6384   1
CLearn_3    0.7780 0.6053 0.3947   1
CLearn_4    0.7407 0.5486 0.4514   1
CLearn_5_R  0.5665 0.3209 0.6791   1
CLearn_6    0.5517 0.3044 0.6956   1
CLearn_7_R  0.6036 0.3643 0.6357   1
CLearn_8_R  0.6384 0.4076 0.5924   1
CLearn_9    0.6185 0.3826 0.6174   1
CLearn_10_R 0.4926 0.2427 0.7573   1

                  ML1
SS loadings    4.1705
Proportion Var 0.4171

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  45  and the objective function was  4.9075 with Chi Square of  970.8732
The degrees of freedom for the model are 35  and the objective function was  1.4593 

The root mean square of the residuals (RMSR) is  0.1302 
The df corrected root mean square of the residuals is  0.1476 

The harmonic number of observations is  203 with the empirical chi square  309.6877  with prob <  1.003e-45 
The total number of observations was  203  with Likelihood Chi Square =  287.7221  with prob <  1.769e-41 

Tucker Lewis Index of factoring reliability =  0.64781
RMSEA index =  0.18853  and the 90 % confidence intervals are  0.16917 0.2096
BIC =  101.7599
Fit based upon off diagonal values = 0.9106
Measures of factor score adequacy             
                                                     ML1
Correlation of (regression) scores with factors   0.9438
Multiple R square of scores with factors          0.8907
Minimum correlation of possible factor scores     0.7814
Code
CL_2 = fa(CogLearning, nfactors=2, rotate = "oblimin", fm="mle")
print(CL_2, digits = 4)
Factor Analysis using method =  ml
Call: fa(r = CogLearning, nfactors = 2, rotate = "oblimin", fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
                ML2     ML1     h2     u2   com
CLearn_1     0.7397  0.1414 0.6696 0.3304 1.073
CLearn_2     0.7266 -0.0950 0.4695 0.5305 1.034
CLearn_3     0.7345  0.1164 0.6366 0.3634 1.050
CLearn_4     0.5618  0.2737 0.5411 0.4589 1.450
CLearn_5_R   0.0828  0.6453 0.4755 0.5245 1.033
CLearn_6     0.7201 -0.1488 0.4358 0.5642 1.085
CLearn_7_R  -0.0331  0.8711 0.7317 0.2683 1.003
CLearn_8_R   0.0168  0.8550 0.7454 0.2546 1.001
CLearn_9     0.7245 -0.0741 0.4779 0.5221 1.021
CLearn_10_R  0.0442  0.6044 0.3934 0.6066 1.011

                         ML2    ML1
SS loadings           3.0718 2.5046
Proportion Var        0.3072 0.2505
Cumulative Var        0.3072 0.5576
Proportion Explained  0.5509 0.4491
Cumulative Proportion 0.5509 1.0000

 With factor correlations of 
       ML2    ML1
ML2 1.0000 0.4893
ML1 0.4893 1.0000

Mean item complexity =  1.1
Test of the hypothesis that 2 factors are sufficient.

The degrees of freedom for the null model are  45  and the objective function was  4.9075 with Chi Square of  970.8732
The degrees of freedom for the model are 26  and the objective function was  0.3078 

The root mean square of the residuals (RMSR) is  0.0403 
The df corrected root mean square of the residuals is  0.053 

The harmonic number of observations is  203 with the empirical chi square  29.6666  with prob <  0.2817 
The total number of observations was  203  with Likelihood Chi Square =  60.4873  with prob <  0.0001442 

Tucker Lewis Index of factoring reliability =  0.93507
RMSEA index =  0.08068  and the 90 % confidence intervals are  0.05445 0.10787
BIC =  -77.6561
Fit based upon off diagonal values = 0.9914
Measures of factor score adequacy             
                                                     ML2    ML1
Correlation of (regression) scores with factors   0.9361 0.9410
Multiple R square of scores with factors          0.8763 0.8855
Minimum correlation of possible factor scores     0.7526 0.7711
Code
CL_3 = fa(CogLearning, nfactors=3, rotate = "oblimin", fm="mle")
print(CL_3, digits = 4)
Factor Analysis using method =  ml
Call: fa(r = CogLearning, nfactors = 3, rotate = "oblimin", fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
                ML2     ML1     ML3     h2     u2   com
CLearn_1     0.7377  0.1533 -0.0266 0.6644 0.3356 1.089
CLearn_2     0.7438 -0.0967 -0.1255 0.4907 0.5093 1.092
CLearn_3     0.7755  0.1035 -0.1448 0.6767 0.3233 1.106
CLearn_4     0.5798  0.2685 -0.0583 0.5422 0.4578 1.433
CLearn_5_R   0.1234  0.6265 -0.0914 0.4830 0.5170 1.122
CLearn_6     0.6560 -0.0962  0.3796 0.5901 0.4099 1.653
CLearn_7_R   0.0115  0.8476 -0.0612 0.7315 0.2685 1.011
CLearn_8_R   0.0599  0.8273 -0.0533 0.7348 0.2652 1.019
CLearn_9     0.6672 -0.0249  0.2433 0.5309 0.4691 1.265
CLearn_10_R -0.0381  0.6873  0.3683 0.5796 0.4204 1.538

                         ML2    ML1    ML3
SS loadings           3.0835 2.5305 0.4099
Proportion Var        0.3084 0.2530 0.0410
Cumulative Var        0.3084 0.5614 0.6024
Proportion Explained  0.5119 0.4201 0.0680
Cumulative Proportion 0.5119 0.9320 1.0000

 With factor correlations of 
      ML2     ML1     ML3
ML2 1.000  0.4460  0.1250
ML1 0.446  1.0000 -0.0061
ML3 0.125 -0.0061  1.0000

Mean item complexity =  1.2
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  45  and the objective function was  4.9075 with Chi Square of  970.8732
The degrees of freedom for the model are 18  and the objective function was  0.1631 

The root mean square of the residuals (RMSR) is  0.0238 
The df corrected root mean square of the residuals is  0.0377 

The harmonic number of observations is  203 with the empirical chi square  10.3759  with prob <  0.9189 
The total number of observations was  203  with Likelihood Chi Square =  31.9328  with prob <  0.02239 

Tucker Lewis Index of factoring reliability =  0.96198
RMSEA index =  0.06155  and the 90 % confidence intervals are  0.02317 0.09633
BIC =  -63.7049
Fit based upon off diagonal values = 0.997
Measures of factor score adequacy             
                                                     ML2    ML1     ML3
Correlation of (regression) scores with factors   0.9401 0.9407  0.7011
Multiple R square of scores with factors          0.8838 0.8848  0.4915
Minimum correlation of possible factor scores     0.7676 0.7697 -0.0169

Important Note

You can change the rotation or factoring method in the code if you want something different:

## Just an example; "pa" = principal axis factoring

CL = fa(CogLearning, nfactors=1, rotate = "promax", fm="pa")
print(CL, digits = 4)

Step 7: Grab the Fit Indices

Kline (2016) and Hu and Bentler (1999) recommend:

  • Chi-square goodness-of-fit test
  • Root mean square error of approximation (RMSEA) < .08
  • Standardized root mean square residual (SRMR) < .08
  • Tucker–Lewis index (TLI) > .90
  • Comparative fit index (CFI) > .90

A non-significant chi-square is ideal, but this test is not especially influential when judging fit.
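
These cutoffs are easy to script. The helper below is hypothetical (check_fit and its arguments are names I made up for illustration, not part of the slides or of psych), with approximate values from the two-factor model plugged in:

```r
# Flags whether each index meets the Hu & Bentler (1999) cutoffs
check_fit <- function(rmsea, srmr, tli, cfi) {
  c(RMSEA = rmsea < .08,
    SRMR  = srmr  < .08,
    TLI   = tli   > .90,
    CFI   = cfi   > .90)
}
check_fit(rmsea = .081, srmr = .040, tli = .935, cfi = .963)
# RMSEA narrowly misses its cutoff here; the other three indices pass
```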

Code
CL_2$STATISTIC
[1] 60.48726
Code
CL_2$dof
[1] 26
Code
## Comparative fit index (CFI)
1 - ((CL_2$STATISTIC-CL_2$dof)/
       (CL_2$null.chisq-CL_2$null.dof))
[1] 0.9627516
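
RMSEA can likewise be approximated by hand from the same quantities. This is the standard formula; psych's internal computation applies its own correction, so the value it prints (0.08068) differs slightly:

```r
# RMSEA from the two-factor model's chi-square, df, and N (values above)
chisq <- 60.48726
df    <- 26
N     <- 203
rmsea <- sqrt(max((chisq - df) / (df * (N - 1)), 0))
round(rmsea, 3)   # ~0.081, in line with the summary() output
```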

Grab TLI, RMSEA and CI, SRMR

Code
summary(CL_2, fit.measure=TRUE, standardized=TRUE)

Factor analysis with Call: fa(r = CogLearning, nfactors = 2, rotate = "oblimin", fm = "mle")

Test of the hypothesis that 2 factors are sufficient.
The degrees of freedom for the model is 26  and the objective function was  0.31 
The number of observations was  203  with Chi Square =  60.49  with prob <  0.00014 

The root mean square of the residuals (RMSA) is  0.04 
The df corrected root mean square of the residuals is  0.05 

Tucker Lewis Index of factoring reliability =  0.935
RMSEA index =  0.081  and the 10 % confidence intervals are  0.054 0.108
BIC =  -77.66
 With factor correlations of 
     ML2  ML1
ML2 1.00 0.49
ML1 0.49 1.00

Choosing A Model

Choose the model with the best fit, or the most theoretically appropriate one.

Step 8: An Iterative Process

  • Check individual item loadings
  • Items that correlate will start to group (load) together
  • Want loadings > .40 on one factor and low loadings on the others
  • Find the poorest-performing item, remove it, and run the analysis again!
Code
CL_2 = fa(CogLearning[ -c(4)], nfactors=2, rotate = "oblimin", fm="mle")
print(CL_2, digits = 4)
Factor Analysis using method =  ml
Call: fa(r = CogLearning[-c(4)], nfactors = 2, rotate = "oblimin", 
    fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
                ML2     ML1     h2     u2   com
CLearn_1     0.7576  0.1498 0.7037 0.2963 1.078
CLearn_2     0.7160 -0.0809 0.4644 0.5356 1.026
CLearn_3     0.7168  0.1379 0.6264 0.3736 1.074
CLearn_5_R   0.0843  0.6477 0.4782 0.5218 1.034
CLearn_6     0.7170 -0.1342 0.4411 0.5589 1.070
CLearn_7_R  -0.0361  0.8866 0.7571 0.2429 1.003
CLearn_8_R   0.0141  0.8376 0.7130 0.2870 1.001
CLearn_9     0.6940 -0.0522 0.4501 0.5499 1.011
CLearn_10_R  0.0512  0.6069 0.4003 0.5997 1.014

                         ML2    ML1
SS loadings           2.6488 2.3856
Proportion Var        0.2943 0.2651
Cumulative Var        0.2943 0.5594
Proportion Explained  0.5261 0.4739
Cumulative Proportion 0.5261 1.0000

 With factor correlations of 
       ML2    ML1
ML2 1.0000 0.4732
ML1 0.4732 1.0000

Mean item complexity =  1
Test of the hypothesis that 2 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  4.1316 with Chi Square of  818.7392
The degrees of freedom for the model are 19  and the objective function was  0.1926 

The root mean square of the residuals (RMSR) is  0.0392 
The df corrected root mean square of the residuals is  0.0539 

The harmonic number of observations is  203 with the empirical chi square  22.4339  with prob <  0.2632 
The total number of observations was  203  with Likelihood Chi Square =  37.913  with prob <  0.006088 

Tucker Lewis Index of factoring reliability =  0.95389
RMSEA index =  0.06985  and the 90 % confidence intervals are  0.03657 0.10267
BIC =  -63.0379
Fit based upon off diagonal values = 0.9915
Measures of factor score adequacy             
                                                     ML2    ML1
Correlation of (regression) scores with factors   0.9297 0.9398
Multiple R square of scores with factors          0.8643 0.8831
Minimum correlation of possible factor scores     0.7287 0.7663

Heads Up!

The ‘4’ in the code refers to the item's position in the variable list from Step 3. In this case, it would remove the fourth variable (CLearn_4).

So, if you want to remove the 6th item, use ‘[ -c(6)]’.

This can get confusing if your items are out of order.

CL_2 = fa(CogLearning[ -c(6)], nfactors=2, rotate = "oblimin", fm="mle")
print(CL_2, digits = 4)
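
If counting positions feels error-prone, you can also drop an item by name. This base-R pattern is an alternative I am suggesting, not something from the slides (toy column names below):

```r
# Drop a column by name rather than by position
toy  <- data.frame(CLearn_1 = 1:3, CLearn_2 = 1:3, CLearn_6 = 1:3)
keep <- setdiff(names(toy), "CLearn_6")
toy2 <- toy[ , keep]
names(toy2)   # "CLearn_1" "CLearn_2"
```

With the real data this would be, e.g., CogLearning[ , setdiff(names(CogLearning), "CLearn_6")].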

Step 9: Create Factors

How many items loaded together? These are your FACTORS

Factor_1 = c(1, 2, 3, 4, 5)
Factor_2 = c(6, 7, 8, 9, 10)

Get Descriptives

Code
CogLearning$ef1 = apply(CogLearning[ , Factor_1], 1, mean)
CogLearning$ef2 = apply(CogLearning[ , Factor_2], 1, mean)
summary(CogLearning)
    CLearn_1        CLearn_2        CLearn_3        CLearn_4    
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:2.000  
 Median :3.000   Median :3.000   Median :4.000   Median :3.000  
 Mean   :3.158   Mean   :2.576   Mean   :3.399   Mean   :2.995  
 3rd Qu.:4.000   3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
   CLearn_5_R       CLearn_6       CLearn_7_R     CLearn_8_R       CLearn_9    
 Min.   :1.000   Min.   :1.000   Min.   :1.00   Min.   :1.000   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:2.000   1st Qu.:3.00   1st Qu.:3.000   1st Qu.:2.000  
 Median :4.000   Median :3.000   Median :4.00   Median :3.000   Median :3.000  
 Mean   :3.645   Mean   :2.882   Mean   :3.67   Mean   :3.419   Mean   :2.995  
 3rd Qu.:5.000   3rd Qu.:4.000   3rd Qu.:5.00   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.00   Max.   :5.000   Max.   :5.000  
  CLearn_10_R         ef1             ef2       
 Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.600   1st Qu.:2.800  
 Median :3.000   Median :3.200   Median :3.200  
 Mean   :3.197   Mean   :3.155   Mean   :3.233  
 3rd Qu.:4.000   3rd Qu.:3.800   3rd Qu.:3.800  
 Max.   :5.000   Max.   :5.000   Max.   :5.000  
Code
CogLearning$ef1 = apply(CogLearning[ , Factor_1], 1, mean)
CogLearning$ef2 = apply(CogLearning[ , Factor_2], 1, mean)
sd(CogLearning$ef1)
[1] 0.7952304
Code
sd(CogLearning$ef2)
[1] 0.768132
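
As an aside, rowMeans() gives the same result as apply(x, 1, mean) and is the more idiomatic (and faster) base-R choice; a quick check with made-up data:

```r
toy <- data.frame(a = c(1, 3), b = c(2, 4), c = c(3, 5))
m1 <- apply(toy[ , c(1, 2)], 1, mean)   # the pattern used above
m2 <- rowMeans(toy[ , c(1, 2)])         # equivalent shortcut
all.equal(m1, m2)
```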

Step 10: Double-Check in SPSS

  • Use the Compute function in SPSS to create a new scale variable from the items retained during the EFA
  • Compare the M and SD for that scale to those obtained here.

Wrap Up!

Remember!

Good researchers support and justify the decisions they made along the way.

What’s Next??

You have successfully put together a quantitative research study. The next step is to present this work to your peers at an academic conference. The content for the final week in the course centers on effective dissemination and presentation of your empirical results. You should leave the day feeling prepared for your final course presentations, as well as any conference presentations you plan to submit to in the future.